1. INTRODUCTION

Air pollution has become a significant problem for human health. It is associated with symptoms and
diseases such as coughing, sneezing, asthma, bronchitis, and lung cancer. Several factors can exacerbate
the problem, including urbanization, industrial plants, agricultural burning, construction, and forest
fires. In Thailand, burning to prepare land for planting is the most important of these factors [1].
Particulate matter (PM) 2.5 is often used as an indicator of air quality. Efficient forecasting and
monitoring of PM2.5 concentrations are therefore reasonable tools for government planning and for
determining control measures. Designing a forecasting algorithm with high accuracy remains a challenging
task. Many previous works have proposed forecasting methods that predict PM2.5 concentration as a time
series [2]. They presented various kinds of forecasting models, such as conventional models, artificial
intelligence models, and hybrid models, to address the problem [3]. In addition, forecasting results have
been used effectively to plan the duration of people's outdoor activities.


Journal homepage: http://ijai.iaescore.com 


ISSN: 2252-8938


Forecasting models are widely used as tools for planning and monitoring resource usage. They also
support decision-making in areas such as finance, marketing, networking, energy, and the environment.
Approaches to the forecasting problem fall into three groups: conventional models, artificial
intelligence models, and hybrid models. Conventional models are classified into two main groups:
statistical methods and deterministic methods. Traditional statistical methods can solve linear and
non-linear problems with high accuracy [4]-[6]. Approaches that handle the forecasting problem
efficiently include non-linear regression [7], [8], autoregressive integrated models [9]-[11], the
extended Kalman filter [12], and exponential smoothing [13]. Deterministic methods do not require
historical data, but they need sufficient basic information [14]. The deterministic model is often
applied to estimate the degree of pollution accurately [15]. The statistical approach delivers more
accurate long-term forecasts than the deterministic approach, although it needs more computational
resources. It generally requires a large amount of historical atmospheric data that depends strongly on
a specific site. As a result, the statistical approach forecasts pollution concentrations accurately for
that site. Techniques such as artificial intelligence and hybrid methods are often applied to enhance
forecasting accuracy further.

Artificial intelligence models achieve high accuracy in weather forecasting and air quality prediction.
This group includes artificial neural networks [16], [17], machine learning [18], [19], deep learning
[20], [21], and evolutionary methods [22]. These models not only give better forecasting accuracy than
mathematical methods but also consume fewer computing resources than the conventional approach. They can
solve highly complex problems for which theoretical models are difficult to construct. Thus, artificial
intelligence (AI) is widely applied to forecasting problems in complex systems such as PM2.5. Many
previous works have studied PM2.5 forecasting techniques based on artificial neural networks (ANN),
including the recursive neural network [14], fuzzy neural network [23], convolutional neural network
(CNN) [24], back propagation neural network [25], and adaptive neural network [26]. However, the ANN
technique must contend with both the local optimum problem and the global optimum problem [27].
Combining AI with evolutionary algorithms, such as particle swarm optimization (PSO) and the genetic
algorithm (GA), is one of the attractive approaches to forecasting problems [28]-[30].

The hybrid forecasting model is a primary approach to improving forecasting performance in various
fields. It is used not only to handle long-term nonlinear problems but also to enhance forecasting
accuracy while keeping performance comparable to the other methods mentioned above. Hybrid techniques
studied in previous works include cluster-based methods [31], ANN with multiple linear regression [32],
ANN with k-means clustering [33], ANN with wavelets [14], back propagation ANN (BPANN) with wavelets
[34], BPANN with adaptive differential evolution [27], deep learning with wavelets [34], recurrent
neural network (RNN) with long short-term memory (LSTM) [35], and multi-objective Harris hawks
optimization (MOHHO) [36]. Hybrid models can help to enhance the accuracy of PM2.5 prediction, and they
also overcome the limitation of single-site prediction by generalizing across sites [37]. The hybrid
approach [38] can also be applied to forecasting complex problems in other fields.

The forecasting algorithms proposed in this work are based on historical prediction. They operate in
two steps. In the first step, a traditional neural network produces a prediction; in the second step,
that prediction is adjusted to improve its accuracy. In the accuracy-improvement step, the forecasting
performance is obtained by comparing the neural network's predicted data with the real measured values.
This performance is then used as a weighting factor to improve the accuracy of the final forecast. The
experimental results show that the proposed algorithms improve accuracy significantly over the
traditional approach. The main contributions of this work are as follows: i) hybrid forecasting
algorithms are proposed that increase predictive accuracy using the historical prediction technique; and
ii) a lightweight technique applied on top of conventional artificial intelligence is shown to improve
the accuracy of forecasting problems efficiently.


2. METHOD 

The algorithms proposed in this work exploit the benefits of neural networks for complex systems and
add lightweight techniques, based on historical forecast data, to improve forecast accuracy. The
technique can be applied to conventional time-series data. This research uses four models: the data
model, the multi-layer perceptron neural network, the proposed algorithms, and the performance
evaluation model. Each is described below.


2.1. Dataset description

The dataset consists of PM2.5 concentrations together with the main factors affecting PM2.5 levels
(temperature, dew point, humidity, and wind speed) in Chiang Rai, a province in northern Thailand. The
data were collected from BerkeleyEarth and WeatherUnderground from 2017 to 2020. Chiang Rai is an
agricultural city that has encountered severe smog over the past decades, especially in the planting
season. This study used data collected daily over three years. The data are divided into two parts: i) a
learning data set from 2017 to 2018, and ii) a testing data set from 2019 to mid-2020. The testing set
is used to evaluate the forecasting performance of the proposed algorithms.
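
The chronological split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the record layout, the function name `split_by_year`, and the placeholder values are assumptions, and the exact end date of the testing period is not stated in the text.

```python
from datetime import date, timedelta

def split_by_year(records, last_training_year=2018):
    """Split (date, factors, pm25) records chronologically.

    Records dated up to and including `last_training_year` form the
    learning set; all later records form the testing set.
    """
    learning = [r for r in records if r[0].year <= last_training_year]
    testing = [r for r in records if r[0].year > last_training_year]
    return learning, testing

# Hypothetical daily records with placeholder values (not real measurements).
start = date(2017, 1, 1)
records = [
    (start + timedelta(days=k),
     {"temperature": 25.0, "dew_point": 18.0, "humidity": 70.0, "wind_speed": 5.0},
     30.0)
    for k in range(3 * 365 + 100)
]
learning_set, testing_set = split_by_year(records)
```

Splitting by date rather than at random keeps every testing record later in time than every learning record, which matches how a forecast would be used in practice.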


2.2. Neural network model 

In this work, we apply a neural network model based on the multi-layer perceptron (MLP). The MLP can
deliver efficient results on non-linear problems and also works well with large input data. A neural
network is normally characterized by three main kinds of layer: the input layer, the hidden layer, and
the output layer. The input layer receives the input parameters from the external environment. The
hidden layer is the middle part of the network and may itself be composed of several hidden layers. In
this part, the weighted value of each factor affecting the forecast value is calculated. The output
layer acts as a node that collects all the results from the hidden layer before delivering the solution.
The neural network model applied in this work is shown in (1). The input layer of the model uses the set
$X$ as the input parameters, where each element $x_j$ represents one factor used in the learning
process.


$$Y_i = \sum_{j} w_{ij}^{l} H_j^{l} + \beta_i \tag{1}$$


The predicted value ($Y_i$) is calculated by combining the weights ($w_{ij}^{l}$) of the hidden nodes
($j$) of layer ($l$), whose outputs are sent to node ($i$) of the next layer, with the bias ($\beta_i$)
of node ($i$). Let $H_i^{l}$ denote hidden node ($i$) of layer ($l$), which can be expressed as
$H_i^{l} = \sum_{j} w_{ij}^{l-1} H_j^{l-1} + \beta_i^{l}$. The hidden layer is computed recursively by
weighting ($w_{ij}^{l-1}$) the previous layer ($l-1$) into hidden node ($i$) of layer ($l$) and adding
the bias ($\beta_i^{l}$) of hidden node ($i$). The first layer is modeled by
$H_i^{1} = \sum_{j} w_{ij} x_j + \beta_i^{1}$: each node ($H_i^{1}$) of the first layer is obtained by
weighting ($w_{ij}$) each input parameter ($j$) into hidden node ($i$).
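
The layer-by-layer computation can be illustrated with a short pure-Python sketch. The function names and the toy weights are hypothetical, and the optional `activation` argument is an assumption of this sketch; the equations as written show only the weighted sums and biases.

```python
def layer_output(weights, biases, inputs):
    """Compute sum_j w_ij * inputs_j + beta_i for every node i of one layer."""
    return [
        sum(w_ij * x_j for w_ij, x_j in zip(row, inputs)) + beta_i
        for row, beta_i in zip(weights, biases)
    ]

def forward(layers, x, activation=lambda v: v):
    """Propagate input vector x through a list of (weights, biases) layers.

    `activation` is applied to hidden layers only. The identity default
    mirrors the weighted-sum form of the equations.
    """
    h = x
    for weights, biases in layers[:-1]:
        h = [activation(v) for v in layer_output(weights, biases, h)]
    out_weights, out_biases = layers[-1]
    return layer_output(out_weights, out_biases, h)

# Toy network with made-up weights: 2 inputs -> 2 hidden nodes -> 1 output.
layers = [
    ([[0.5, -1.0], [2.0, 0.0]], [0.0, 1.0]),  # hidden layer
    ([[1.0, 1.0]], [0.5]),                    # output layer
]
y = forward(layers, [2.0, 1.0])
```

Here the hidden nodes evaluate to 0.0 and 5.0, so the single output is 0.0 + 5.0 + 0.5 = 5.5.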

An empirical approach is used to find the best neural network structure, that is, the number of hidden
layers and the number of nodes in each. The experimental settings range from 6 to 30 nodes per layer and
from 1 to 4 hidden layers. For each combination, we also trained the model with 200 to 1400 training
cycles. The best model is the one with the minimum mean absolute error (MAE). Based on this testing, the
selected forecasting model consists of four hidden layers, each containing 28 nodes. The output of this
configured neural network serves as the preliminary prediction, which the algorithms proposed in this
research then refine into a more accurate forecast.
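
A structure search of this kind might be organized as below. This is a sketch under stated assumptions: `select_structure` and `fake_mae` are hypothetical names, the step sizes within the stated ranges are guesses, and the stand-in scorer does not train a network; in practice `evaluate_mae` would train a model and return its MAE.

```python
from itertools import product

def select_structure(evaluate_mae, layer_options, node_options, cycle_options):
    """Return (best_config, best_mae) over every combination of settings.

    `evaluate_mae(layers, nodes, cycles)` is supplied by the caller and is
    expected to train a network with that structure and return its MAE.
    """
    best_config, best_mae = None, float("inf")
    for config in product(layer_options, node_options, cycle_options):
        mae = evaluate_mae(*config)
        if mae < best_mae:
            best_config, best_mae = config, mae
    return best_config, best_mae

# Stand-in scorer for demonstration only; its minimum is placed arbitrarily.
def fake_mae(layers, nodes, cycles):
    return abs(layers - 4) + abs(nodes - 28) / 10 + abs(cycles - 800) / 1000

best, score = select_structure(
    fake_mae,
    layer_options=range(1, 5),            # 1 to 4 hidden layers
    node_options=range(6, 31, 2),         # 6 to 30 nodes; step of 2 is assumed
    cycle_options=range(200, 1401, 200),  # 200 to 1400 cycles; step of 200 is assumed
)
```

An exhaustive grid is affordable here because the search space is small; a larger space would call for a coarser grid or random sampling.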


2.3. Proposed algorithms 

This section describes each algorithm built on the historical prediction technique. The algorithms use
historical predictive data to improve forecasting accuracy. The preliminary result of the neural network
serves as the baseline forecast. Before the final solution is presented, the accuracy of this candidate
solution is improved by the proposed algorithm, as shown in Figure 1. The main concept of the proposed
algorithms is to exploit historical predictive data from both long-term and short-term forecasting. The
historical data, obtained by computing the relative error of previous predictions, are used as a
weighting factor in the second step.


2.3.1. Coefficient weighted algorithm 

The preliminary predictive results of the neural network are used to calculate a correlation
coefficient from the long-term historical data. The correlation coefficient against the measurement data
is estimated from the average forecasting performance with a linear regression technique. This algorithm
is called coefficient weighted (CW). The main idea is to adjust the forecast coefficient toward 1.0 each
time, because the forecast value is then closer to the actual measured value. In Figure 1(a), the target
of forecasting is $Y_t = Y_i$, where $Y_t$ is the target predicted value, which needs to be close to the
measured value, and $Y_i$ is the predicted result of the neural network. The linear regression consists
of a regression coefficient $w_t$ and a correlation term $\varepsilon_t$. The relation to the measured
value $x_t$ is shown in (2), where $c_t$ is the regression coefficient of the trendline and $b_t$ is the
correlation term of the target prediction. Finally, the improved forecast value ($\hat{Y}_i$) is
computed from the regression coefficient ($w_t$) and the correlation term ($\varepsilon_t$), both
weighted by the coefficient of the long-term historical accuracy, as shown in (3). The weighting
coefficient ($w_t$) is the inverse of the regression coefficient, $1/c_t$, and $\varepsilon_t$ is the
correlation term weighted by the same coefficient, $b_t/c_t$.


$$x_t = \frac{Y_i + b_t}{c_t} \tag{2}$$


$$\hat{Y}_i = w_t Y_i + \varepsilon_t \tag{3}$$
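
As a minimal sketch of the CW adjustment, the two coefficients and equation (3) can be coded directly. The function names are hypothetical. The value $b_t = 7.6$ is not given in the text; it is back-computed here from the example values $c_t = 0.8$ and $\varepsilon_t = 9.5$ in section 2.5.

```python
def cw_coefficients(c_t, b_t):
    """Return (w_t, eps_t) = (1 / c_t, b_t / c_t) from the trendline terms."""
    if c_t == 0:
        raise ValueError("regression coefficient c_t must be non-zero")
    return 1.0 / c_t, b_t / c_t

def cw_adjust(y_nn, w_t, eps_t):
    """Apply equation (3): adjusted forecast = w_t * y_nn + eps_t."""
    return w_t * y_nn + eps_t

# c_t = 0.8 and eps_t = 9.5 come from the example case; b_t = 7.6 is the
# value implied by eps_t = b_t / c_t and is an assumption of this sketch.
w_t, eps_t = cw_coefficients(c_t=0.8, b_t=7.6)
adjusted = cw_adjust(60.5, w_t, eps_t)  # about 85.13
```

Because the coefficients come from long-term history, they can be computed once from the learning data and reused for every new forecast.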


2.3.2. Latest recently measured algorithm 

Next, this forecasting algorithm improves performance by weighting with the most recently measured
actual data and the previous forecast error. The recent measurement of the PM2.5 level acts as a
short-term historical weighting technique, so the algorithm can efficiently predict PM2.5 concentration
in unstable conditions. This proposed algorithm is called latest recently measured (LRM). As shown in
Figure 1(b), the algorithm obtains the improved forecast (represented by a circle with a dotted line at
$t$) from the last measured value (represented by a circle with a solid line at $t-1$), weighted by the
prediction error of the previous forecast (at $t-1$). The improved result $\hat{Y}^{t}$, as shown in
(4), is computed from the most recent measurement of PM2.5 ($M^{t-1}$) and the relative error of the
latest prediction ($\alpha^{t-1}$), which is given by (5). The current result of the neural network
($Y^{t}$) is weighted by this latest relative forecasting error before the final result is presented.


$$\hat{Y}^{t} = M^{t-1} + \alpha^{t-1} Y^{t} \tag{4}$$


$$\alpha^{t} = \frac{Y^{t} - M^{t}}{M^{t}} \tag{5}$$


Figure 1. Concept of the proposed algorithms (a) CW and (b) LRM
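
The LRM update in (4) and (5) can be sketched as follows. The function names are hypothetical, and normalizing the error by the measured value in `relative_error` is an assumption of this sketch. The numeric inputs are those of the example case in section 2.5.

```python
def relative_error(forecast, measured):
    """Relative error of a forecast, as in equation (5).

    Dividing by the measured value is an assumption of this sketch.
    """
    if measured == 0:
        raise ValueError("measured value must be non-zero")
    return (forecast - measured) / measured

def lrm_adjust(y_nn, measured_prev, alpha_prev):
    """Apply equation (4): previous measurement plus the error-weighted forecast."""
    return measured_prev + alpha_prev * y_nn

# Example-case values: Y^t = 60.5, M^{t-1} = 83, alpha^{t-1} = 0.26.
improved = lrm_adjust(60.5, measured_prev=83.0, alpha_prev=0.26)
```

With these rounded inputs the result is about 98.7; the example case reports 98.8, presumably because it was computed from an unrounded error value.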


2.3.3. Mixed CW and LRM algorithm 

The mixed CW and LRM algorithm draws on the strengths of the two previous algorithms. The coefficient
weighted algorithm normally forecasts PM2.5 concentration efficiently under stable conditions, whereas
the latest recently measured technique gives more accurate forecasts under inclement conditions.

We therefore developed the mixed coefficient weighted and latest recently measured (mixed CW-LRM)
algorithm to improve accuracy. The algorithm uses the average PM2.5 value ($\varepsilon$) under normal
conditions as the threshold for selecting the result, as shown in (6). Before the forecasting result is
deployed, the latest measured PM2.5 value ($M^{t-1}$) is compared with this threshold, and the
comparison is the criterion for selecting the solution. In the next section, the experimental results of
each algorithm are discussed.
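
A selection rule of this kind might look like the sketch below. The direction of the comparison is inferred from the example case in section 2.5, where a previous measurement of 83 against a threshold of 44.5 leads to the LRM forecast being chosen. The function name and the handling of the equality case are assumptions.

```python
def mixed_cw_lrm(cw_forecast, lrm_forecast, measured_prev, threshold):
    """Select between the CW and LRM forecasts.

    Assumed rule: use LRM when the latest measurement exceeds the
    normal-condition average (threshold); otherwise use CW.
    """
    if measured_prev > threshold:
        return "LRM", lrm_forecast
    return "CW", cw_forecast

# Example-case values: CW gives 85.13, LRM gives 98.8, M^{t-1} = 83, threshold = 44.5.
choice, forecast = mixed_cw_lrm(85.13, 98.8, measured_prev=83.0, threshold=44.5)
```

The rule needs only one comparison per forecast, so it adds essentially no cost on top of the two candidate forecasts.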


2.4. Performance evaluation model 

For forecasting problems, there are many metrics for evaluating the accuracy of the predicted results.
The first metric used in this work is MAE, the mean absolute difference between the forecast values and
the actual measured values. It is defined as
$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} |m_i - p_i|$. Another metric used to compare prediction
performance is the root mean square error (RMSE). It is one of the most frequently used measures of
forecast values against actual measured values because it is highly sensitive to large errors. The RMSE
is given by $\mathrm{RMSE} = \left( \frac{1}{N} \sum_{i=1}^{N} (p_i - m_i)^2 \right)^{1/2}$. Here $N$
denotes the number of collected data points, $p_i$ is the predicted value, and $m_i$ is the actual
measured value. Let $\bar{m}$ be the average of the actual measured values. The correlation factor
($R^2$) is defined as
$R^2 = \sum_{i=1}^{N} (p_i - \bar{m})^2 / \sum_{i=1}^{N} (m_i - \bar{m})^2$. The metric $R^2$
indicates the amount of systematic error in the proposed model. The next subsection gives an example of
how the proposed algorithms improve forecasting accuracy.
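
The three metrics translate directly into code. The sketch below follows the definitions given in this subsection, with $R^2$ taken as the ratio of the two sums of squares about the measured mean. The function names and the sample values are illustrative only.

```python
from math import sqrt

def mae(measured, predicted):
    """Mean absolute error between measured and predicted values."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)

def rmse(measured, predicted):
    """Root mean square error; large errors are penalized more heavily."""
    squared = sum((p - m) ** 2 for m, p in zip(measured, predicted))
    return sqrt(squared / len(measured))

def r_squared(measured, predicted):
    """Ratio of predicted to measured variation about the measured mean."""
    mean_m = sum(measured) / len(measured)
    explained = sum((p - mean_m) ** 2 for p in predicted)
    total = sum((m - mean_m) ** 2 for m in measured)
    return explained / total

# Made-up values for illustration.
measured = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
scores = (mae(measured, predicted), rmse(measured, predicted), r_squared(measured, predicted))
```

For these sample values the absolute errors are 2, 2, and 3, so the MAE is 7/3 and the RMSE is the square root of 17/3.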


2.5. Example case 

This subsection briefly walks through the algorithmic details of improving the forecasting results.
First, we input the factor parameters to the model learned with the configuration described in section
2.2. Next, the CW algorithm tests the predictive efficiency of the neural network on the learning
dataset to find the coefficient $c_t$; in this example $c_t = 0.8$ (thus $w_t = 1.25$) and
$\varepsilon_t = 9.5$. Suppose that on day 1 the forecast value from the neural network is
$Y^{t} = 60.5$. After adjustment with (3) of the CW algorithm, the forecast PM2.5 concentration is
85.13. For the LRM algorithm, the forecast value from the neural network ($Y^{t} = 60.5$) is weighted by
the previous forecast error from (5), $\alpha^{t-1} = 0.26$, and combined with the previous measurement
$M^{t-1} = 83$, giving a forecast value of 98.8 as calculated in (4). Given that the threshold
$\varepsilon$ equals 44.5, the mixed CW-LRM algorithm chooses the LRM solution as the forecast value.
This example is taken from one record of the test dataset, for which the measured PM2.5 concentration is
94. In the next section, the forecasting performance of the proposed algorithms is presented.
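
The arithmetic of this example can be written out as follows, using the rounded values quoted above. The last row is not part of the original example: it lists the absolute differences from the measured value of 94 and is added only for comparison.

```latex
\begin{align*}
\text{CW:}\quad & w_t Y^{t} + \varepsilon_t = 1.25 \times 60.5 + 9.5 = 85.125 \approx 85.13 \\
\text{LRM:}\quad & M^{t-1} + \alpha^{t-1} Y^{t} = 83 + 0.26 \times 60.5 = 98.73 \\
\text{Errors:}\quad & |60.5 - 94| = 33.5, \quad |85.13 - 94| = 8.87, \quad |98.8 - 94| = 4.8
\end{align*}
```

With the rounded $\alpha^{t-1} = 0.26$ the LRM value comes to 98.73, slightly below the reported 98.8, which presumably reflects an unrounded error value. For this record both adjusted forecasts are much closer to the measurement than the raw neural network output.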


3. RESULTS AND DISCUSSION 

The neural network model was built with Keras, a high-level neural network API running on TensorFlow,
and the proposed algorithms were applied to improve forecasting accuracy in a discrete-event simulation.
The predictive performance of the proposed algorithms is presented in Figure 2. Performance is obtained
by correlating the actual measured value (x-axis) with the forecast value (y-axis). A correlation
coefficient close to 1 (r = 1.0) indicates a highly accurate forecast. A testing data set containing 460
time-series records is used to evaluate the effectiveness of the forecast.

The experimental results in Figure 2 show that the neural network forecasts accurately when PM2.5 is at
average levels but less accurately when PM2.5 is high and fluctuating. The forecasts in the high range
(at the tail of the graph) deviate from the trend line, so the average forecast performance of the
neural network corresponds to a correlation coefficient of 0.58.

The proposed CW algorithm is therefore used to adjust the forecast values before the final forecast is
given. The forecast values are adjusted with a new coefficient obtained from the inverse of the neural
network's coefficient ($1/c_t$), as in (3). The CW algorithm yielded 25% higher forecasting accuracy
because it effectively improved forecasts in the unsteady range where PM2.5 is high for a short period
of time. With CW, the forecasts, especially at high PM2.5, tend to move closer to the trendline, with a
correlation coefficient of 0.75.

However, under normal conditions the concentration of PM2.5 changes continuously: the level may
gradually increase or decrease significantly over time. The LRM method prioritizes the most recent
measured value and uses the previous forecast error to weight the forecast value from the neural
network. LRM gives a more accurate forecast of PM2.5 concentration, with a correlation coefficient of
0.85. Its average forecasts are 32.5% and 12.7% more accurate than those of the neural network and CW,
respectively.

Finally, the experimental results of the mixed CW-LRM, a method that combines the advantages of CW and
LRM for final predictive tuning, are presented. This method uses $\varepsilon$ as the criterion for
selecting between the forecasts, based on the previous measurement ($M^{t-1}$). The value of
$\varepsilon$ is computed from the mean of the PM2.5 concentration levels in each area. By selecting
between the two forecasts in this way, the mixed CW-LRM achieves average forecast accuracy 9.4% better
than LRM and 38.9% better than the neural network. In conclusion, all three methods require preliminary
forecasting data from the neural network, which they then fine-tune into more accurate forecast values.
Figure 3 shows the time series plot and forecasting performance of each algorithm, comparing the
forecasts of the proposed algorithms with the measured PM2.5 concentrations over the 460 test records.


Figure 2. Forecasting results of Neural network, CW, LRM, and Mixed CW-LRM


Figure 3. Performance evaluation result of the proposed algorithms


The forecasting accuracy results in Table 1 show that the CW algorithm had a better correlation
coefficient than the neural network method, but its MAE and RMSE were 14.90 and 22.91, respectively,
which is less accurate than the neural network. The LRM algorithm delivers more accurate forecasts than
the CW algorithm: its MAE and RMSE are 26% and 45% lower, respectively. The mixed CW-LRM algorithm
combines the advantages of the CW and LRM algorithms to increase forecast accuracy. It achieves
acceptable forecasting performance and reduces MAE and RMSE relative to LRM by 31% and 21%,
respectively.


Table 1. Error of the proposed algorithms 


Algorithm        MAE     RMSE
Neural Network   11.74   17.96
CW               14.90   22.91
LRM              10.95   12.65
Mixed CW-LRM      7.53    9.94
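
The relative reductions quoted in this section can be recomputed from Table 1 with a few lines of code. This is a small check rather than part of the original study; the dictionary layout and the function name are illustrative.

```python
# MAE and RMSE values copied from Table 1.
TABLE_1 = {
    "Neural Network": {"MAE": 11.74, "RMSE": 17.96},
    "CW": {"MAE": 14.90, "RMSE": 22.91},
    "LRM": {"MAE": 10.95, "RMSE": 12.65},
    "Mixed CW-LRM": {"MAE": 7.53, "RMSE": 9.94},
}

def reduction_percent(baseline, improved, metric):
    """Percentage by which `improved` lowers `metric` relative to `baseline`."""
    before = TABLE_1[baseline][metric]
    after = TABLE_1[improved][metric]
    return 100.0 * (before - after) / before

metrics = ("MAE", "RMSE")
lrm_vs_cw = [reduction_percent("CW", "LRM", m) for m in metrics]
mixed_vs_lrm = [reduction_percent("LRM", "Mixed CW-LRM", m) for m in metrics]
mixed_vs_nn = [reduction_percent("Neural Network", "Mixed CW-LRM", m) for m in metrics]
```

The results are roughly 26.5% and 44.8% for LRM against CW, 31.2% and 21.4% for mixed CW-LRM against LRM, and 35.9% and 44.7% for mixed CW-LRM against the neural network, in line with the rounded percentages reported in the text.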


Table 1 summarizes the MAE and RMSE of the three proposed algorithms. The MAE values of the CW, LRM,
and mixed CW-LRM algorithms are 14.90, 10.95, and 7.53, respectively, and the corresponding RMSE values
are 22.91, 12.65, and 9.94. The next section presents the conclusion of the paper.


4. CONCLUSION 

Forecasting PM2.5 concentration is a challenging research problem. An efficient model accurately
predicts the level of PM2.5 so that the period of outdoor activities can be determined. The model also
helps the government sector determine the duration of burning in the harvest season. In this work, three
forecasting algorithms that combine neural networks with historical prediction data are proposed. The
traditional neural network provides acceptable accuracy in forecasting PM2.5 concentrations; the
structure with the minimum MAE consists of four hidden layers of 28 nodes each. In the
accuracy-improvement step, we propose the CW, LRM, and mixed CW-LRM algorithms. The correlation
coefficients of the three proposed algorithms were 0.75, 0.86, and 0.95, respectively. The MAE and RMSE
of LRM and mixed CW-LRM were lower than those of neural network forecasting: the mixed CW-LRM reduced
MAE and RMSE by 36% and 45%, respectively, relative to the traditional neural network. Therefore, the
proposed algorithms based on historical data can improve PM2.5 concentration forecasting efficiently.
For future work, the learning process can be improved by determining the main factors of neural network
prediction, which would help to increase forecasting speed and accuracy. The use of evolutionary
algorithms is another way to improve accuracy through self-improving forecasting.